DATA RESYNCHRONIZATION CIRCUIT

Field of the Invention 

The invention relates to delay locked loop based circuits and in particular to
delay locked loop based circuits for use with an IEEE 1394-1995 decoder (IEEE Std
1394-1995, published August 30, 1996).

Background

IEEE 1394-1995 decoders are based on a non-return-to-zero (NRZ) transmission
of a data signal in which a strobe is also transmitted to recover the digital data
from the NRZ data signal. From the NRZ data signal and the strobe, a recovery clock
may be constructed which is used to extract the actual digital data from the NRZ
data signal. The transmission of the NRZ data signal and the strobe allows for a
reliable transmission and receipt of digital data. During packet transmission, there
is only a single node transmitting on the bus, so the entire media can operate in a
half duplex mode using the two signals: Data and Strobe. As shown in Figure 1, NRZ
data is transmitted on Data and is accompanied by the Strobe signal, which changes
state whenever two consecutive NRZ data bits are the same, ensuring that a
transition occurs on either Data or Strobe for each data bit. Figure 2 illustrates an
example of an IEEE 1394-1995 decoder 5. Decoder 5 receives the NRZ data signal and
the strobe to generate a recovery clock, using a plurality of flip flops 10, 15, and 20
to generate data data_1, data data_0 and a quarter clock qrt_clk, respectively. The
three signals are then used to construct the original digital data transmitted by the
source. A clock that transitions each bit period can be derived from the exclusive-or
of Data with Strobe. The primary rationale for use of this transmission code is to
improve the skew tolerance of information to be transferred across the serial bus.
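
By way of illustration only, the Data-Strobe relationship described above may be
modeled in a few lines of Python. The function names ds_encode and
recovered_clock and the initial line levels are assumptions made for this sketch
and do not appear in IEEE Std 1394-1995 or in the figures.

```python
def ds_encode(bits, data0=0, strobe0=0):
    """Return (data, strobe) level lists for a sequence of NRZ bits.

    Strobe changes state whenever the current bit equals the previous bit,
    so exactly one of Data or Strobe changes in every bit period.
    data0 and strobe0 are assumed idle levels before the first bit.
    """
    data, strobe = [], []
    prev_d, prev_s = data0, strobe0
    for b in bits:
        s = prev_s ^ 1 if b == prev_d else prev_s
        data.append(b)
        strobe.append(s)
        prev_d, prev_s = b, s
    return data, strobe


def recovered_clock(data, strobe):
    """Exclusive-or of Data with Strobe: a clock that transitions each bit."""
    return [d ^ s for d, s in zip(data, strobe)]


if __name__ == "__main__":
    d, s = ds_encode([1, 0, 0, 1, 1, 1, 0])
    print("data  ", d)
    print("strobe", s)
    print("clock ", recovered_clock(d, s))
```

Because exactly one of the two lines changes per bit, the exclusive-or alternates
on every bit period regardless of the data pattern.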

However, the signals generated by the decoder are not directly usable
because the recovered data and the recovered clock need to be in sync with the local
clock of the circuit using the data. Generally, this function is performed by a data
re-timing circuit. Previously, a phase locked loop (PLL) circuit was used for timing
and carrier recovery to ensure optimal data sampling using a local clock. However,
there are many disadvantages to using a PLL based circuit, in particular in high
speed and low power applications. For example, PLL based circuits require a long
acquisition time, normally in the range of 100-2000 cycles, before a "lock" takes
place. In high speed circuits, such a delay is not acceptable. To minimize the
acquisition time, one previous method maintained a certain level of transition
activity so as to maintain a PLL lock. However, such transition activity generally
resulted in power dissipation, which in certain instances is undesirable.

Accordingly, there is a need for an IEEE 1394-1995 compatible resync circuit
that is suitable for high speed, low power applications and that has a relatively
short acquisition time.

SUMMARY

In accordance with an embodiment of the invention, there is disclosed an
apparatus including three sampling circuits to sample incoming data and a quarter
clock. A clock generation unit is included to generate at least three sampling clocks
from a local clock. Each of the three sampling clocks is configured to sample the
incoming data and the quarter clock. A phase detector is also included to detect a
phase difference between the quarter clock and the local clock and to generate a
recovered quarter clock. A delay line is further included to delay the sampled
incoming data and the recovered quarter clock by the detected phase difference.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying drawings are included to provide a further understanding
of the invention, and are incorporated in and constitute a part of this specification.
The drawings illustrate embodiments of the invention and, together with the
description, serve to explain the principles of the invention. In the drawings,

Figure 1 illustrates a timing diagram for Data-Strobe decoding, that is, a clock 
that transitions each bit period derived from the exclusive-or of Data with Strobe, in 
accordance with an embodiment of the invention. 

Figure 2 is a schematic diagram of an IEEE 1394-1995 Decoder in accordance
with an embodiment of the invention.

Figure 3 is a schematic diagram of a system comprising a clock generation
(CKGEN) unit, a fine digital delay line (FDDL) unit, a coarse digital delay line
(CDDL) unit, a phase detector (PD) unit, and three over-sampler (OS) units in
accordance with an embodiment of the invention.

Figure 4 is a schematic diagram of the CKGEN that divides the local clock by 2
to generate 4 equally spaced clocks, the rising edges of these clocks being used as time
references to synchronize the operation of the entire system in accordance with an
embodiment of the invention.

Figure 5 illustrates OS sampling points and further decoding by the OS2 unit
to determine the time location of the incoming data transition in accordance with
an embodiment of the invention.

Figure 6 illustrates a state machine that issues control signals to the delay
lines based on the current system state and the incoming data transition location in
accordance with an embodiment of the invention.

Figure 7 illustrates region e1 which corresponds to the region between the
first and second sampling points, e2 which corresponds to the region between the
second and third sampling points, and so forth in accordance with an embodiment
of the invention.

Figure 8 is a schematic diagram of the FDDL unit comprising four 4-to-1
multiplexers used to select the optimal in-cycle delay data (dn) from the 4 delayed
data of both data_0 and data_1 paths and a dual-data port (dd) used when two data
transitions are detected in a single cycle in accordance with an embodiment of the
invention.

Figure 9 is a schematic diagram of the CDDL unit comprising a multi-stage
first-in first-out (FIFO) register array, where the outputs from the twin FIFO are
combined into one using a multiplexer at the end of the FIFO in accordance with an
embodiment of the invention.

Figure 10 is a schematic diagram of a peripheral controller comprising a data
resynchronization circuit, and coupled to a processor that is adapted to access data
from the peripheral controller.

DETAILED DESCRIPTION

In one aspect, the invention describes a technique to ensure safe data capture
and resynchronization of serial data obtained from an IEEE 1394-1995 decoder to a
local clock of a circuit using the data. In one embodiment, the invention uses a
digital delay locked loop based circuit to adaptively adjust the optimal sampling
position, thereby re-synchronizing the incoming data with the local clock.

As shown in Figure 3, a re-timing circuit 25 according to one embodiment of
the invention comprises three over-sampler units 30, 35, and 40, a clock generation
(CKGEN) unit 45, a phase detector 50, a coarse digital delay line (CDDL) 55 and a fine
digital delay line (FDDL) 60. Because a digital delay line is used to synchronize the
incoming data with the local clock, an acquisition time of 4-10 cycles is possible,
thereby eliminating the need for transition activities as in a PLL circuit.
Over-sampling unit 30 is used to over-sample data_1. Similarly, over-sampling unit 35
over-samples data_0 and over-sampling unit 40 over-samples the quarter clock. The
over-sampling clocks are provided by CKGEN 45. CKGEN 45 receives the local clock
and generates clocks ck, dlck, ckb, and dlckb. As shown in Figure 4, the four equally
spaced clocks are phase shifted by a quarter cycle with respect to each other. In one
embodiment, CKGEN 45 generates the four clocks by dividing the local clock by two
using a frequency divider (e.g., flip flop). The rising edge of the first clock, ck, is then
made to synchronize with a rising edge of the local clock. The rising edge of the
second clock, dlck, is synchronized with the immediate falling edge of the local
clock. The third clock, ckb, may be generated by inverting the first clock, ck, and the
fourth clock, dlckb, may be generated by inverting the second clock, dlck. CKGEN 45
may comprise flip flops and inverters to generate the four equally spaced clocks in
the manner described above.
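
The divide-by-two scheme described above may be sketched behaviorally as
follows. This is an illustrative model only, not the circuit of Figure 4: time is
counted in half periods of the local clock, even steps are assumed to be rising
edges and odd steps falling edges, and the names ckgen and rising_edges are
hypothetical.

```python
def ckgen(num_half_periods):
    """Model CKGEN over a number of local-clock half periods.

    ck toggles on each rising edge of the local clock and dlck toggles on
    each falling edge, so both run at half the local clock frequency;
    ckb and dlckb are their inversions.
    """
    ck = dlck = 0
    waves = {"ck": [], "dlck": [], "ckb": [], "dlckb": []}
    for t in range(num_half_periods):
        if t % 2 == 0:
            ck ^= 1      # rising edge of the local clock (assumed at even t)
        else:
            dlck ^= 1    # falling edge of the local clock (assumed at odd t)
        waves["ck"].append(ck)
        waves["dlck"].append(dlck)
        waves["ckb"].append(ck ^ 1)
        waves["dlckb"].append(dlck ^ 1)
    return waves


def rising_edges(wave):
    """Time steps (after step 0) at which a waveform goes from 0 to 1."""
    return [t for t in range(1, len(wave)) if wave[t] == 1 and wave[t - 1] == 0]


if __name__ == "__main__":
    for name, wave in ckgen(12).items():
        print(name, rising_edges(wave))
```

In this model each generated clock has a period of four half periods of the local
clock, and the rising edges of ck, dlck, ckb, and dlckb fall one step apart, which
corresponds to the quarter cycle spacing described above.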

The generated four equally spaced clocks are used as sampling points to
sample the data data_1, data data_0, and the quarter clock. Because the sampling
points derive from the local clock, as will be apparent below, the sampled data is in
sync with the local clock and suitable for processing by the receiving circuit.
Although four sampling points are shown in Figure 5, more sampling points may
be used.

The sampling points of the over-sampling unit 40 are used to determine the
phase difference between the quarter clock, qrt_clk, and the local clock, clk.
If only two sampling points were used, it would be difficult to determine
whether the quarter clock, qrt_clk, is leading or lagging the local clock. Using three
or more sampling points, this determination is possible and is used by phase
detector 50 to align the local clock with the quarter clock.

Figure 6 is a phase detector 65 in accordance with one embodiment of the
invention. At the core of phase detector 65, there is a four-state state machine 70
corresponding to the number of sampling points. Thus, if the number of sampling
points is three, a three-state state machine would be used. The regions e1-e4 of the
state machine are the sampled points of the over-sampling unit 40, which are
further decoded to determine the phase transition of the quarter clock qrt_clk. As
shown in Figure 7, region e1 corresponds to the region between the first and second
sampling points, region e2 corresponds to the region between the second and third
sampling points, and so forth. As shown in Figure 6, with the use of flip flops 80 and
83, q0 and q1, which represent data_0 and data_1 of the current state, are input
into four-state state machine 70 along with an input from regions e1-e4.
Thus, depending on the detected phase transition and the current state of machine
70, phase detector 65 will transmit various control signals. Control signals including
shift left (SL), shift right (SR), and dual data enable (DDE) control the coarse digital
delay line (CDDL). The single data select (S) and dual data select (T), which are
obtained using a plurality of flip-flops 75, control the fine digital delay line (FDDL).
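
As a non-limiting illustration, the decoding of over-sampled points into regions
e1-e4 may be sketched as a comparison of adjacent samples. The function name
decode_regions is hypothetical, and the sketch assumes that region e4 is closed by
the first sampling point of the following cycle, which the description implies
("and so forth") but does not state.

```python
def decode_regions(samples, next_first):
    """Return the regions (e.g. ["e2"]) in which qrt_clk changed level.

    samples    -- levels of qrt_clk at sampling points 1..4 of one cycle
    next_first -- level at sampling point 1 of the next cycle (assumed to
                  close region e4)
    """
    if len(samples) != 4:
        raise ValueError("expected four sampling points")
    pts = list(samples) + [next_first]
    # A transition lies in region e(i+1) when adjacent samples differ.
    return ["e%d" % (i + 1) for i in range(4) if pts[i] != pts[i + 1]]


if __name__ == "__main__":
    print(decode_regions([0, 0, 1, 1], 1))  # transition between points 2 and 3
    print(decode_regions([0, 1, 1, 0], 0))  # two transitions in one cycle
```

An empty result corresponds to a cycle without a transition, and a result with two
entries corresponds to the dual transition case for which the dual data signals
are provided.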

The operation of phase detector 65 is as follows. Assume that, initially, a clock
transition of the quarter clock qrt_clk occurs between the second and third sampling
points. The region between the two will be decoded as e2, which is input into state
machine 70. A data select (S) is transmitted to the fine digital delay line (FDDL) 85.
As will be described further below, FDDL 85 controls the phase difference within the
local clock cycle (in-cycle). If the phase error is more than one local clock cycle,
coarse digital delay line (CDDL) 110 is used to compensate for the multi-local clock
cycle phase difference. SL or SR signals are transmitted to CDDL 110 if the quarter
clock transition occurs before region e1 or after region e4, respectively. Assuming
that state machine 70 is at state 2, which reflects the inputted region e2, and the next
sampling round shows that the quarter clock transition is occurring at region e1, this
indicates that the quarter clock is leading. The state machine transmits the
appropriate signals to FDDL 85 to compensate for the phase difference. State
machine 70 appropriately updates its state to state 1, reflecting the inputted region e1.
If a subsequent sampling round shows that the quarter clock transition is in region
e4, then state machine 70 will recognize that the phase difference is a multi-local
clock cycle phase difference. Having detected a single data transition, state
machine 70 will transmit an SL signal to CDDL 110 and an S signal to FDDL 85 while
updating its state to state 4.
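
The walk-through above may be summarized, for illustration only, by the
following simplified model of a single-transition update. It is a sketch under
stated assumptions and not the state machine of Figure 6: the function name
pd_step is hypothetical, the DDE and T signals are omitted, a move from state 1
to region e4 is treated as crossing the cycle boundary (SL) as in the example
above, and the opposite move from state 4 to region e1 is assumed to produce SR
by symmetry.

```python
def pd_step(state, region):
    """One update of a simplified four-state phase detector model.

    state  -- current state, 1..4 (region of the previous transition)
    region -- region of the newly detected transition, 1..4 (e1..e4)
    Returns (new_state, s, sl, sr), where s is the FDDL select line and
    sl / sr are the shift requests sent to the CDDL.
    """
    if state not in (1, 2, 3, 4) or region not in (1, 2, 3, 4):
        raise ValueError("state and region must be in 1..4")
    sl = state == 1 and region == 4   # transition crossed the boundary before e1
    sr = state == 4 and region == 1   # assumed symmetric case, after e4
    return region, region, sl, sr


if __name__ == "__main__":
    state = 2
    for region in (1, 4):             # the sequence e2 -> e1 -> e4 from the text
        state, s, sl, sr = pd_step(state, region)
        print("state", state, "S", s, "SL", sl, "SR", sr)
```

For the sequence e2, e1, e4 the model issues only an S update on the first step
and an S update together with SL on the second, matching the narrative above.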

Figure 8 illustrates a fine digital delay line (FDDL) 85 in accordance with one
embodiment of the invention. FDDL 85 comprises a crossbar structure of 4-to-1
multiplexers 90, 95, 100, and 105 to select the optimal in-cycle delay of data_0 and
data_1. The operation is as follows: Assuming that signal e2 has been inputted into
state machine 70, state machine 70 transmits an S signal to the second control line of
multiplexers 90 and 95. This causes the second sampled data point of oversampling
unit 30 and oversampling unit 35, which are in sync with the local clock, to be
selected and passed through. In addition, a dual-data port (dd) is designed to cover
the situation when two data transitions are detected in a single cycle. These
multiplexers 90, 95, 100, and 105 are controlled by S and T signals from phase
detector unit 65. Of course, this is but one embodiment of a delay line and other
delay lines may be used to perform this function.
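
The selection performed by the multiplexers may be illustrated by the following
sketch. The names mux4 and fddl are hypothetical, and the sketch assumes that
the S signal drives the pair of multiplexers producing the single-data output
(dn) while the T signal drives the pair producing the dual-data output (dd); the
description states only that the four multiplexers are controlled by S and T.

```python
def mux4(inputs, select):
    """4-to-1 multiplexer; select is the 1-based control line (1..4)."""
    if len(inputs) != 4 or select not in (1, 2, 3, 4):
        raise ValueError("need four inputs and a select line in 1..4")
    return inputs[select - 1]


def fddl(os_data1, os_data0, s, t=None):
    """Select in-cycle delayed samples of data_1 and data_0.

    os_data1, os_data0 -- four over-sampled points of each data path
    s -- single data select from the phase detector
    t -- dual data select, or None when no second transition was detected
    Returns (dn, dd); each is a (data_1, data_0) pair, dd may be None.
    """
    dn = (mux4(os_data1, s), mux4(os_data0, s))
    dd = None if t is None else (mux4(os_data1, t), mux4(os_data0, t))
    return dn, dd


if __name__ == "__main__":
    # Region e2 decoded: S selects the second sampled point of both paths.
    print(fddl([1, 0, 1, 1], [0, 1, 0, 0], s=2))
```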

Figure 9 illustrates a coarse digital delay line (CDDL) 110 in accordance with
an embodiment of the invention. CDDL 110 comprises a plurality of first-in
first-out (FIFO) registers 115 where each register is equivalent to one local clock
cycle delay. CDDL unit 110 comprises a twin 7-stage first-in first-out (FIFO) register
array to cover a ± 6 cycle delay adjustment range, in order to account for a possible 6
bit-error in the IEEE 1394-1995 decoder. The input-to-output delay adjustment of
CDDL 110 is done through controlling the data injection pointer 125 along FIFO 115.
Initially, the data injection pointer 125 is pre-set to the center of the FIFO array 115
and then adaptively controlled by the shift left (SL) and shift right (SR) signals from
phase detector 65. The operation is as follows: Assuming phase detector 65 detects a
clock transition in region e4 following a previous clock transition in region e1, phase
detector 65 will recognize that the data transition is now occurring out of cycle. In
this instance, state machine 70 in Figure 6 transmits an S signal to the fourth control
line of the multiplexers in FDDL 85 and also a shift-left (SL) signal to FIFO registers
115 of CDDL 110. On receipt of the SL signal, CDDL 110 shifts left one bit, delaying
the data by one cycle to compensate for a one cycle lead of the quarter clock over the
local clock. The design allows up to two sets of data (dn and dd) to be injected into
FIFO 115 simultaneously to cover non-, single-, and dual-data receiving in a single
local clock cycle, resulting from the time variation of the input data. Finally, the
outputs from the twin FIFO 115 are combined into one using a multiplexer 120 at
the end of FIFO 115 before sending out the sync data.
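
The effect of moving the data injection pointer may be illustrated with a
simplified single-FIFO model. This sketch is not the twin FIFO of Figure 9: the
class name CoarseDelayLine is hypothetical, only one datum is injected per
cycle, the output is assumed to be taken from the last stage, and no attempt is
made to reproduce the adjustment range stated above.

```python
class CoarseDelayLine:
    """Single shift-register model of a coarse delay line.

    Data enters at an injection pointer and moves one stage toward the
    output on every local clock cycle, so the pointer position sets the
    input-to-output delay.
    """

    def __init__(self, stages=7):
        self.regs = [None] * stages
        self.ptr = stages // 2            # pointer pre-set to the center

    def clock(self, bit, sl=False, sr=False):
        """Advance one local clock cycle and return the output stage."""
        if sl and self.ptr > 0:
            self.ptr -= 1                 # shift left: one more cycle of delay
        if sr and self.ptr < len(self.regs) - 1:
            self.ptr += 1                 # shift right: one less cycle of delay
        self.regs[self.ptr] = bit
        out = self.regs[-1]
        self.regs = [None] + self.regs[:-1]
        return out


if __name__ == "__main__":
    line = CoarseDelayLine()
    outputs = []
    for i, bit in enumerate("abcdefg"):
        outputs.append(line.clock(bit, sl=(i == 2)))   # SL on the third cycle
    print(outputs)
```

In this model a shift left issued on the third cycle inserts one empty cycle at the
output between the data injected before and after the shift, that is, the later data
is delayed by one additional local clock cycle.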

Figure 10 is a schematic diagram that illustrates a system 130 wherein a
peripheral controller 150 comprises a data resynchronization circuit 155. Peripheral
controller 150 is coupled to processor 140 via a serial or parallel bus 145. Processor
140 is adapted to access data from peripheral controller 150 via bus 145.

Memory 135 and display controller 160 may also be coupled to peripheral
controller 150 via bus 145. Monitor 165 may also be coupled to display controller
160. Other peripheral devices 170, such as a mouse, CD-ROM and video, may also be
coupled to peripheral controller 150.

Figure 10 illustrates but one application of the invention, that is, the personal
computer, but the invention may be used with other applications such as a work
station, server, Internet driver or other fabric channels.

Compared to the analog delay locked-loop (DLL) synchronization approaches,
the digital delay locked-loop (DLL) solution described in the invention is very
suitable for system integration using advanced digital process technology and
design environments. Other advantages of this invention include: a fully digital
circuit implementation using highly reusable blocks for shorter development time,
lower development cost, and higher manufacturing yield; a twin-pipe (data_0 and
data_1) architecture doubling the throughput of the data path and consequently
allowing the core logic to operate at half of the core frequency; and a scalable
architecture allowing extension of the locked-in range by simply increasing the
number of delay line stages. Although the current circuit is implemented for IEEE
1394-1995 data communication, the technique described in this invention can also be
used for most other data communication systems, such as a Community Access
Television (CATV) network, the Public Switched Telephone Network (PSTN), the
Integrated Services Digital Network (ISDN), the Internet, a local area network (LAN),
a wide area network (WAN), a wireless communications network, or an
asynchronous transfer mode (ATM) network.

In the foregoing specification, the invention has been described with
reference to specific embodiments thereof. It will, however, be evident that various
embodiments and changes can be made thereto without departing from the broader
spirit and scope of the invention as set forth in the appended claims. The
specification and drawings are, accordingly, to be regarded in an illustrative rather
than a restrictive sense.